AMENDMENTS TO THE CLAIMS

1-2. (CANCELED)

3. (CURRENTLY AMENDED) A computer-implemented method of constructing a 
model for predicting molecular behavior using marker molecules, said method comprising: 

classifying respective molecules in a training set of reference molecules as either

possessing or not possessing at least one chemical or biological property; 

selecting, from said training set, a plurality of molecules that possess said at least

one chemical or biological property as target molecules for potential selection as marker

molecules for said model; [[a first subset of said training set of reference molecules,

wherein all of the molecules in said subset possess the at least one property;]]

selecting some of said target molecules as marker molecules for said model by 

evaluating the predictive accuracy of said potential marker molecules, wherein said 

evaluating comprises: 

[[comparing all molecules in said training set with all other molecules in
said training set in accordance with]] computing a numerical value defining a
measure of molecular structural similarity for each pair of molecules in said
training set using a pre-defined structural similarity metric;

selecting one of said target molecules (T) [[a target molecule from said first

subset]];

sorting all training set molecules in descending order of structural 
similarity to molecule T as defined by the computed numerical values; 

defining, for a first one of said sorted training set molecules (M) [[a first
molecule in said training set other than said target molecule]], a
fractions-correctly-predicted metric as a ratio of A/B, wherein B is defined as the total number of
training set molecules that have a computed numerical structural similarity to
molecule T that is as large or larger than the computed numerical structural
similarity between molecules T and M, and wherein A is defined as the number of
training set molecules that both (1) have a computed numerical structural
similarity with molecule T that is as large or larger than the computed numerical
structural similarity between molecules T and M, and (2) possess the at least one
chemical or biological property [[as the number of molecules in said training set
that are members of said first subset and that have a structural similarity to said
target molecule at least as great as said first molecule's structural similarity to said
target molecule divided by the total number of molecules in said training set
having a structural similarity to said target molecule at least as great as said each
other molecule's structural similarity to said target molecule]];

repeating the defining step for each [[molecule]] other sorted training set
molecules [[in said training set other than said target molecule]];

[[determining, from molecules in said training set having a fractions
correctly predicted metric below a threshold value, which molecule has the
highest structural similarity to said target molecule;]]

[[counting the number of molecules in said training set having a higher
structural similarity to said target molecule than said molecule determined in said
determining step;]]

choosing [[said target molecule]] molecule T as a marker molecule if said
number B and said ratio A/B are both above respective threshold values when
computed during at least one of said defining steps; [[of molecules determined in
the counting step is equal to or greater than a preselected value;]] and

outputting data indicating that [[said target molecule]] molecule T has been chosen as
a marker molecule.
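
For illustration only, the evaluating steps of Claim 3 as amended might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `is_marker`, `has_property`, `similarity`, `min_b` and `min_ratio` are hypothetical, molecule T is excluded from its own sorted list, and "above" a threshold is read as "at or above".

```python
from typing import Callable, Hashable, Sequence


def is_marker(
    target: Hashable,
    training_set: Sequence[Hashable],
    has_property: Callable[[Hashable], bool],
    similarity: Callable[[Hashable, Hashable], float],
    min_b: int,
    min_ratio: float,
) -> bool:
    """Return True if `target` (molecule T) would be chosen as a marker."""
    # Similarity of each training set molecule to T, paired with its class.
    # Assumption: T itself is left out of the list.
    scored = [
        (similarity(target, m), has_property(m))
        for m in training_set
        if m != target
    ]
    # Sort in descending order of structural similarity to T.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Defining step, repeated for each sorted molecule M.
    for sim_m, _ in scored:
        # B: molecules at least as similar to T as M is.
        b = sum(1 for s, _ in scored if s >= sim_m)
        # A: those of the B molecules that also possess the property.
        a = sum(1 for s, p in scored if s >= sim_m and p)
        # Assumption: "above" is treated as "at or above" each threshold.
        if b >= min_b and a / b >= min_ratio:
            return True
    return False
```

Any similarity metric that returns a number for a pair of molecules could be passed as `similarity`; the sketch does not depend on a particular one.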

4. (CANCELED) 

5. (CANCELED) 

6. (CURRENTLY AMENDED) The method of Claim 3, additionally comprising
repeating said [[determining and counting steps]] choosing for a plurality of different threshold
values.

7. (CURRENTLY AMENDED) The method of Claim 3, comprising repeating said
selecting a target molecule, sorting, defining, [[repeating, determining, counting,]] and choosing
steps for other molecules [[of said first subset]] that possess the at least one chemical or biological
property at a plurality of different threshold values [[and preselected number of molecules value]]
so as to select a plurality of preliminary sets of marker molecules.
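
As a hedged illustration of the repetition recited in Claim 7, the following Python sketch builds one preliminary marker set per combination of threshold values. The names `preliminary_marker_sets` and `chooser` are hypothetical, and pairing every B threshold with every A/B threshold is an assumption; the claim recites only "a plurality of different threshold values".

```python
from itertools import product
from typing import Callable, Dict, Hashable, List, Sequence, Tuple


def preliminary_marker_sets(
    targets: Sequence[Hashable],
    b_thresholds: Sequence[int],
    ratio_thresholds: Sequence[float],
    chooser: Callable[[Hashable, int, float], bool],
) -> Dict[Tuple[int, float], List[Hashable]]:
    """Map each (B threshold, A/B threshold) pair to its preliminary marker set.

    `chooser(t, min_b, min_ratio)` is a hypothetical callable that reports
    whether target molecule `t` is chosen as a marker at those thresholds.
    """
    sets: Dict[Tuple[int, float], List[Hashable]] = {}
    # Assumption: every B threshold is combined with every A/B threshold.
    for min_b, min_ratio in product(b_thresholds, ratio_thresholds):
        sets[(min_b, min_ratio)] = [
            t for t in targets if chooser(t, min_b, min_ratio)
        ]
    return sets
```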

8. (PREVIOUSLY PRESENTED) The method of Claim 7, comprising 
choosing a final set of marker molecules by making molecular behavior predictions for all 
molecules in said training set using each one of said preliminary sets of marker molecules, and 
choosing as said final set of marker molecules the preliminary set that most accurately predicts 
molecular behavior of molecules of said training set. 

9-18. (CANCELED) 

19. (NEW) The method of Claim 3, wherein said threshold for B is 5, and said
threshold for A/B is 1.



